Replication File for "China's Ideological Spectrum"
The Journal of Politics, forthcoming.

By Jennifer Pan and Yiqing Xu


** Notes **

1. In the R scripts, several chunks of the code are commented out to
speed up the replication process (some of the procedures take a very
long time). The intermediate results are saved in the "output" folder
and can be found in the replication materials.

2. Please set the working directory to the folder where ReadMe.txt is
located.

3. Successful replication requires running the eight R scripts listed
under ** R files ** consecutively, in the order given.

4. If not printed on screen, tables (LaTeX code) are stored in the
"output" folder; graphs are stored in the "graphs" folder.

5. The "maps" folder stores shape files that are used to produce
Figure 2.

6. Since Harvard Dataverse does not allow subfolders, files tagged
"code", "data", "maps", or "output" should be placed in the
corresponding folders ("./code/", "./data/", "./maps/", and
"./output/", where "." is the main replication folder).
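
The folder layout in Notes 4-6 can be set up with a short R sketch
like the one below. It is not part of the original replication code.
The function names make_layout() and place_files() are hypothetical,
and creating "graphs" in advance is an assumption (the scripts may
create it themselves). Which downloaded file carries which tag has to
be supplied by hand.

```r
# Sketch only: create the folders named in this ReadMe under `root`.
# `root` is an assumption; point it at the main replication folder.
make_layout <- function(root = ".") {
  folders <- c("code", "data", "maps", "output", "graphs")
  for (f in folders) {
    dir.create(file.path(root, f), showWarnings = FALSE, recursive = TRUE)
  }
  invisible(file.path(root, folders))
}

# Hypothetical helper: move files that carry a given tag into the
# matching folder. The file-to-tag mapping is provided by the user.
place_files <- function(files, folder, root = ".") {
  dest <- file.path(root, folder, basename(files))
  ok <- file.rename(files, dest)
  if (!all(ok)) {
    warning("Could not move: ", paste(files[!ok], collapse = ", "))
  }
  invisible(dest)
}

# Example usage (file names taken from this ReadMe):
# make_layout(".")
# place_files(c("1_toyexample.R", "2_stats.R"), "code")
# place_files("sample10K.RData", "data")
```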

** Datasets **
 (in the "data" folder)

1. zuobiao_raw_frame.dta
-- Data frame of the raw zuobiao data. Data on the answers to the 50
zuobiao questions were removed at the request of the zuobiao team but
are available upon request. Geographic locations are determined by
respondents' IP addresses. When the location cannot be uniquely
identified, we duplicate the data for each possible location and put
a weight ("ipwgt") on each duplicated observation (the weights for a
respondent sum to 1). "id1" uniquely identifies a respondent.

(1) provgb:	provincial GB code; 81 = HK; 99 = overseas
(2) overseas:	1 = overseas respondent; 0 = otherwise
(3) gender: 	0 = female; 1 = male
(4) birthyear:	birth year
(5) educ:	education; 1 = below high school; 2 = high school; 3 = college; 4 = above college
(6) income:	annual income; 1 = 0-25k; 2 = 25-50k; 3 = 50-75k; 4 = 75-100k; 5 = 100-150k; 6 = 150-300k; 7 = > 300k; 1.5 = 0-50k; 3.5 = 50-150k
		
2. zuobiao1214_wgt_frame.dta
-- Data frame of the zuobiao data with population-based weights (based
on gender, birth year, and province according to the 2005 inter-census
population survey). Observations from Hong Kong, Tibet, Ningxia,
Qinghai, and abroad are dropped.

3. abs_short.dta
-- Asian Barometer Survey data with selected questions

4. prov_corr.dta
-- provincial-level indicators

5. abs_mi.RData
-- ABS data with missing values filled in by multiple imputation

6. sample10K.RData (main data file)
-- a 10,000-observation sample of the zuobiao data
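
To illustrate how the "ipwgt" weights in zuobiao_raw_frame.dta are
meant to work, here is a minimal sketch on toy data. The rows, the
province codes, and the equal 0.5/0.5 split are made up for
illustration and are not drawn from the replication files. Only the
column names "id1", "provgb", "ipwgt", and "educ" come from the
description above.

```r
# Toy data (an assumption, not real observations): respondent 2 has an
# ambiguous IP location and appears once per candidate province.
toy <- data.frame(
  id1    = c(1, 2, 2, 3),
  provgb = c(11, 31, 32, 44),
  ipwgt  = c(1, 0.5, 0.5, 1),
  educ   = c(3, 4, 4, 2)
)

# The weights should sum to 1 within each respondent.
wsum <- tapply(toy$ipwgt, toy$id1, sum)
stopifnot(all(abs(wsum - 1) < 1e-8))

# The number of respondents is the total weight, not the row count.
n_rows        <- nrow(toy)
n_respondents <- sum(toy$ipwgt)

# A weighted mean counts each respondent once; the unweighted mean
# would count respondent 2 twice.
mean_educ_weighted   <- weighted.mean(toy$educ, toy$ipwgt)
mean_educ_unweighted <- mean(toy$educ)
```

The same per-respondent check can be applied to the real data frame
once it is loaded, by replacing "toy" with the loaded object.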


** R files **

 (in the "code" folder)
Please run these scripts in order:

1_toyexample.R
-- Create toy examples
-- Output: Figure 1

2_stats.R
-- Descriptive statistics
-- Output: Figures 2, 3, Tables A1, A2

3_pca.R
-- Conduct principal component analyses (PCA)
-- Output: Figure 4

4_cfa_search.R
-- Search for the optimal CFA model
-- Output: Table 1, Figure 5

5_cfa.R
-- Conduct confirmatory factor analyses (CFA)
-- Output: Table A3, Figures 6-9, A1-A2

6_cfa_ind.R
-- Analyze individual-level correlations
-- Output: Figures 10 (upper), 11 (upper)

7_cfa_prov.R
-- Analyze provincial-level correlations
-- Output: Figures 12, A3

8_abs_cfa.R
-- Conduct CFA with the ABS data
-- Output: Table A5, Figures 10 (lower), 11 (lower), A4, A5
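
One possible way to run the eight scripts in sequence is sketched
below. It is not part of the original replication code. The helper
name run_all() is hypothetical, and the sketch assumes that the
scripts sit in "./code/", that the working directory is the folder
containing ReadMe.txt, and that no script changes the working
directory.

```r
# Script names are taken from the list above.
scripts <- c("1_toyexample.R", "2_stats.R", "3_pca.R", "4_cfa_search.R",
             "5_cfa.R", "6_cfa_ind.R", "7_cfa_prov.R", "8_abs_cfa.R")

# Hypothetical helper: check that every script exists before running
# any of them, then source them one by one in the order given.
run_all <- function(scripts, code_dir = "code") {
  paths <- file.path(code_dir, scripts)
  missing <- paths[!file.exists(paths)]
  if (length(missing) > 0) {
    stop("Missing scripts: ", paste(missing, collapse = ", "))
  }
  for (p in paths) {
    message("Running ", p)
    source(p, echo = FALSE)  # evaluated in the global environment
  }
  invisible(paths)
}

# Run from the folder that contains ReadMe.txt:
# run_all(scripts)
```

Because source() stops at the first error, a failure in an early
script prevents later scripts from running on incomplete
intermediate results.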


** Citation **

Jennifer Pan and Yiqing Xu. 2017. "China's Ideological Spectrum." The
Journal of Politics, forthcoming.

@article{PanXu2017,
author = {Pan, Jennifer and Xu, Yiqing},
journal = {The Journal of Politics},
title = {{China's Ideological Spectrum}},
volume = {(forthcoming)},
year = {2017}
}
